A framework that supports in writing performance-optimized stencil-based codes

Authors

  • Markus Stürmer
  • Ulrich Rüde
Abstract

On modern multicore processors, many applications run much slower than one would expect when looking at the vast aggregate computational power. After a discussion of factors determining the performance on such CPUs, a framework concept that simplifies writing performance-optimized codes for stencil-based algorithms and a prototypical implementation are presented. Finally, the suitability of this approach is discussed by analyzing performance results for three different applications which have been optimized within that framework.

Similar resources

Writing productive stencil codes with overlapped tiling

Stencil computations constitute the kernel of many scientific applications. Tiling is often used to improve the performance of stencil codes for data locality and parallelism. However, tiled stencil codes typically require shadow regions, whose management becomes a burden to programmers. In fact, it is often the case that the code required to manage these regions, and in particular their ...

A Stencil DSEL for Single Code Accelerated Computing with SYCL

Stencil kernels arise in many scientific codes as a result of discretizing natural, continuous phenomena. Many research works have designed stencil frameworks to help programmers optimize stencil kernels for performance, and to target CPUs or accelerators. However, existing stencil kernels, either library-based or language-based, require writing distinct source code for accelerated ker...

A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels

Despite extensive research, optimal performance has not easily been available previously for matrix multiplication (especially for large matrices) on most architectures because of the lack of a structured approach and the limitations imposed by matrix storage formats. A simple but effective framework is presented here that lays the foundation for building high-performance matrix-multiplication ...

Perceptual Learning Style Preferences and Computer-Assisted Writing Achievement within the Activity Theory Framework

Learning styles are considered among the significant factors that aid instructors in deciding how well their students learn a second or foreign language (Oxford, 2003). Although this issue has been accepted broadly in educational psychology, further research is required to examine the relationship between learning styles and language learning skills. Thus, the present study was carried out to in...

Auto-tuning Stencil Codes for Cache-Based Multicore Platforms

By Kaushik Datta. Doctor of Philosophy in Computer Science, University of California, Berkeley. Professor Katherine A. Yelick, Chair. As clock frequencies have tapered off and the number of cores on a chip has taken off, the challenge of effectively utilizing these multicore systems has become increasingly important. However, the diversi...

Publication date: 2010